iMAP: Discovering Complex Semantic Matches between Database Schemas

Authors

  • Robin Dhamankar
  • Yoonkyong Lee
  • AnHai Doan
  • Alon Halevy
  • Pedro Domingos
Abstract

Creating semantic matches between disparate data sources is fundamental to numerous data sharing efforts. Manually creating matches is extremely tedious and error-prone. Hence many recent works have focused on automating the matching process. To date, however, virtually all of these works deal only with one-to-one (1-1) matches, such as address = location. They do not consider the important class of more complex matches, such as address = concat(city, state) and room-price = room-rate * (1 + tax-rate). We describe the iMAP system, which semi-automatically discovers both 1-1 and complex matches. iMAP reformulates schema matching as a search in an often very large or infinite match space. To search effectively, it employs a set of searchers, each discovering specific types of complex matches. To further improve matching accuracy, iMAP exploits a variety of domain knowledge, including past complex matches, domain integrity constraints, and overlap data. Finally, iMAP introduces a novel feature that generates explanations of predicted matches, to provide insights into the matching process and suggest actions to converge on correct matches quickly. We apply iMAP to several real-world domains to match relational tables, and show that it discovers both 1-1 and complex matches with high accuracy.
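To make the idea of searching a match space concrete, here is a minimal Python sketch of one toy searcher that looks for concatenation matches such as address = concat(city, state). It is an illustration only, not iMAP's actual searchers or scoring: the function names, the sample data, the ", " separator, and the value-overlap score are all assumptions introduced for this example.

```python
from itertools import permutations

def concat_candidates(source_columns, max_arity=2):
    """Enumerate candidate matches of the form concat(c1, ..., ck)."""
    names = list(source_columns)
    for k in range(1, max_arity + 1):
        for combo in permutations(names, k):
            yield combo

def apply_concat(source_columns, combo, sep=", "):
    """Build the values a candidate would produce, row by row.
    The separator is an assumption; a real system would have to search over it."""
    rows = zip(*(source_columns[name] for name in combo))
    return [sep.join(row) for row in rows]

def score(candidate_values, target_values):
    """Fraction of candidate values that also appear in the target column."""
    target = set(target_values)
    hits = sum(1 for v in candidate_values if v in target)
    return hits / len(candidate_values) if candidate_values else 0.0

def best_concat_match(source_columns, target_values, max_arity=2):
    """Return (score, combo) for the highest-scoring candidate."""
    scored = [
        (score(apply_concat(source_columns, combo), target_values), combo)
        for combo in concat_candidates(source_columns, max_arity)
    ]
    return max(scored, key=lambda pair: pair[0])

# Hypothetical sample data: the two tables describe the same entities
# in a different row order.
source = {
    "city": ["Seattle", "Madison", "Urbana"],
    "state": ["WA", "WI", "IL"],
    "zip": ["98195", "53706", "61801"],
}
address = ["Madison, WI", "Seattle, WA", "Urbana, IL"]

best_score, best_combo = best_concat_match(source, address)
print(best_combo, best_score)
```

Scoring by shared values only works when the two sources describe some of the same entities. The number of candidates also grows quickly with the number of columns and the arity, which is why an exhaustive enumeration like this one is practical only for very small schemas.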


Similar resources

Management of Xml Data by Means of Schema Matching

The eXtensible Markup Language (XML) has emerged as a de facto standard to represent and exchange information among various applications on the Web and within organizations due to XML’s inherent data self-describing capability and flexibility of organizing data. As a result, the number of available (heterogeneous) XML data is rapidly increasing, and the need for developing high-performance tech...


Finding nontrivial semantic matches between database schemas

Automation of schema matching has been under investigation for decades, yet systems still usually fail to find all matches or suggest incorrect ones. Because of this imperfection, schema matching is still often done manually by domain experts. The rapidly increasing number of heterogeneous and distributed data...


Data-Driven Schema Matching in Agricultural Learning Object Repositories

As the wealth of structured repositories of educational content for agricultural object is increasing, the problem of heterogeneity between them on a semantic level is becoming more prominent. Ontology matching is a technique that helps to identify the correspondences on the description schemas of different sources and provide the basis for interesting applications that exploit the information ...


Discovering Simple Mappings Between Relational Database Schemas and Ontologies

Ontologies proliferate with the growth of the Semantic Web. However, most of the data on the Web are still stored in relational databases. Therefore, it is important to establish interoperability between relational databases and ontologies for creating a Web of data. An effective way to achieve interoperability is finding mappings between relational database schemas and ontologies. In this paper, w...


Class Structures and Lexical Similarities of Class Names for Ontology Matching

Semantic interoperability is a major issue for National Spatial Data Infrastructures (NSDIs), and mapping across heterogeneous databases is essential for such interoperability. Mapping of schemas based on ontology mapping provides opportunities for semantic translation of schema elements and hence for database queries across heterogeneous sources. Such semantics based mappings are usually human...




Publication date: 2004